1,573 research outputs found

    Understanding and Improving the Latency of DRAM-Based Memory Systems

    Over the past two decades, the storage capacity and access bandwidth of main memory have improved tremendously, by 128x and 20x, respectively. These improvements are mainly due to the continuous technology scaling of DRAM (dynamic random-access memory), which has been used as the physical substrate for main memory. In stark contrast with capacity and bandwidth, DRAM latency has remained almost constant, reducing by only 1.3x in the same time frame. Therefore, long DRAM latency continues to be a critical performance bottleneck in modern systems. Increasing core counts and the emergence of increasingly data-intensive and latency-critical applications further stress the importance of providing low-latency memory access. In this dissertation, we identify three main problems that contribute significantly to the long latency of DRAM accesses. To address these problems, we present a series of new techniques. Our new techniques significantly improve both system performance and energy efficiency. We also examine the critical relationship between supply voltage and latency in modern DRAM chips and develop new mechanisms that exploit this voltage-latency trade-off to improve energy efficiency. The key conclusion of this dissertation is that augmenting DRAM architecture with simple and low-cost features, and developing a better understanding of manufactured DRAM chips together lead to significant memory latency reduction as well as energy efficiency improvement. We hope and believe that the proposed architectural techniques and the detailed experimental data and observations on real commodity DRAM chips presented in this dissertation will enable development of other new mechanisms to improve the performance, energy efficiency, or reliability of future memory systems. Comment: PhD Dissertation

    Improving DRAM Performance by Parallelizing Refreshes with Accesses

    Modern DRAM cells are periodically refreshed to prevent data loss due to leakage. Commodity DDR DRAM refreshes cells at the rank level. This degrades performance significantly because it prevents an entire rank from serving memory requests while being refreshed. DRAM designed for mobile platforms, LPDDR DRAM, supports an enhanced mode, called per-bank refresh, that refreshes cells at the bank level. This enables a bank to be accessed while another in the same rank is being refreshed, alleviating part of the negative performance impact of refreshes. However, there are two shortcomings of per-bank refresh. First, the per-bank refresh scheduling scheme does not exploit the full potential of overlapping refreshes with accesses across banks because it restricts the banks to be refreshed in a sequential round-robin order. Second, accesses to a bank that is being refreshed have to wait. To mitigate the negative performance impact of DRAM refresh, we propose two complementary mechanisms, DARP (Dynamic Access Refresh Parallelization) and SARP (Subarray Access Refresh Parallelization). The goal is to address the drawbacks of per-bank refresh by building more efficient techniques to parallelize refreshes and accesses within DRAM. First, instead of issuing per-bank refreshes in a round-robin order, DARP issues per-bank refreshes to idle banks in an out-of-order manner. Furthermore, DARP schedules refreshes during intervals when a batch of writes are draining to DRAM. Second, SARP exploits the existence of mostly-independent subarrays within a bank. With minor modifications to DRAM organization, it allows a bank to serve memory accesses to an idle subarray while another subarray is being refreshed. 
Extensive evaluations show that our mechanisms improve system performance and energy efficiency compared to state-of-the-art refresh policies, and the benefit increases as DRAM density increases. Comment: The original paper published in the International Symposium on High-Performance Computer Architecture (HPCA) contains an error. The arXiv version has an erratum that describes the error and the fix for i
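The contrast the abstract draws between round-robin per-bank refresh and refreshing idle banks out of order can be illustrated with a small sketch. This is not the paper's scheduler: the `Bank` record, its fields, and both selection functions are hypothetical names introduced here, and the sketch ignores refresh deadlines and write-drain intervals that a real memory controller would have to handle.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bank:
    # Hypothetical per-bank state, assumed for illustration only.
    pending_requests: int  # demand accesses currently queued for this bank
    refreshes_owed: int    # per-bank refreshes not yet issued


def round_robin_pick(banks: List[Bank], next_index: int) -> int:
    """Baseline: refresh banks in a fixed sequential order, busy or not."""
    return next_index % len(banks)


def out_of_order_pick(banks: List[Bank]) -> Optional[int]:
    """Prefer an idle bank that still owes a refresh.

    If every bank that owes a refresh is busy, fall back to the one with
    the fewest queued requests. Return None when no refresh is owed.
    """
    owing = [i for i, b in enumerate(banks) if b.refreshes_owed > 0]
    if not owing:
        return None
    idle = [i for i in owing if banks[i].pending_requests == 0]
    if idle:
        return idle[0]
    return min(owing, key=lambda i: banks[i].pending_requests)


banks = [Bank(pending_requests=3, refreshes_owed=1),
         Bank(pending_requests=0, refreshes_owed=1),
         Bank(pending_requests=2, refreshes_owed=1)]

rr_choice = round_robin_pick(banks, next_index=0)  # picks the busy bank 0
ooo_choice = out_of_order_pick(banks)              # picks the idle bank 1
```

In this toy state, the round-robin baseline refreshes a bank with three queued requests, while the out-of-order choice refreshes the bank with none, which is the overlap of refreshes with accesses that the abstract describes.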

    Role of strain and growth conditions on the growth front profile of InxGa1−xAs on GaAs during the pseudomorphic growth regime

    Theoretical and experimental studies are presented to understand the initial stages of growth of InGaAs on GaAs. Thermodynamic considerations show that, as strain increases, the free‐energy minimum surface of the epilayer is not atomically flat, but three‐dimensional in form. Since by altering growth conditions the strained epilayer can be grown near equilibrium or far from equilibrium, the effect of strain on growth modes can be studied. In situ reflection high‐energy electron diffraction studies are carried out to study the growth modes and surface lattice spacing before the onset of dislocations. The surface lattice constant does not change abruptly from that of the substrate to that of the epilayer at the critical thickness, but changes monotonically. These observations are consistent with the simple thermodynamic considerations presented. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/70448/2/APPLAB-53-8-684-1.pd

    Molecular beam epitaxial growth and luminescence of InxGa1−xAs/InxAl1−xAs multiquantum wells on GaAs

    This letter reports the successful molecular beam epitaxial growth of high‐quality InxGa1−xAs/InxAl1−xAs directly on GaAs. In situ observation of dynamic high‐energy electron diffraction oscillations during growth of InxGa1−xAs on GaAs indicates that the average cation migration rates are reduced due to the surface strain. By raising the growth temperature to enhance the migration rate and by using misoriented epitaxy to limit the propagation of threading and screw dislocations, we have grown device‐quality In0.15Ga0.85As/In0.15Al0.85As multiquantum wells on GaAs with a 0.5–1.0 μm In0.15Ga0.85As buffer layer. The luminescence efficiency of the bound exciton peak increases with misorientation and its linewidth varies from 11 to 15 meV. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/69823/2/APPLAB-51-4-261-1.pd